-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Give LocalBitcoinNode a timeout in case BitcoinJ hangs during handshake attempt #4058
Give LocalBitcoinNode a timeout in case BitcoinJ hangs during handshake attempt #4058
Conversation
The fix is such that if BitcoinJ hangs (like @wiz was sometimes experiencing), a 5 second timeout will get triggered. This 5 second prolonging isn't ideal, but the other alternative is diving into BitcoinJ for patching. That would be difficult, because I can't reproduce the issue myself. Another alternative is reducing the timeout. Even a 2 seconds timeout shouldn't get triggered under normal circumstances. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested this PR and it does prevent my issue #4057 from reproducing. However, I feel I should NACK this PR because of the multiple issues this feature has introduced (including #4052), and I personally feel we should revert the feature for now and re-evaluate it (with heavy testing) before merging it back into master. The most important thing is to keep Bisq stable and we must be risk averse when it comes to these types of changes.
That was my first intention, but the problem with reverting this feature is that unfortunately too much code touching this parts of the codebase occurred, so I can't just revert it by the GitHub feature. So it would also need lots of manual code changes which are also error-prone as well. I agree that keeping Bisq stable is key, so I think for the future we should also discuss having more devs than just two involved in release testing. In our last three releases we always had issues based on very specific OS setups. That can only be detected if lots of people are at least starting and smoke testing new releases. In short for the v1.2.9 update I think we need to go with the best local bitcoin node detection solution we can come up with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
I think, like @ripcurlx, that we should merge this PR since it's fixing something that's wrong in the current code base. If we decide to not release the local bitcoin node detection we'll make a separate revert commit later on. I won't merge until at least one more person gives an ACK or @wiz changes his mind though. @wiz this PR doesn't affect the inclusion of local bitcoin node in the next release. If that's your reason for NACK I think that's the wrong way to suggest reverting the entire feature. |
@dmos62 When running with a local full node I recognized that I always the following exception log in the console:
The connection is established successfully to the node, but I did get this exception. |
Those should be some of the expected exceptions I have not muted. I'll double check and submit PR to mute them. |
So the misguiding exceptions pointed out by @ripcurlx are a more difficult problem than I expected. To be clear, they are harmless. They are thrown because we close the Peer while it's still doing something in parallel. The "in parallel" bit is what complicates things. I'll continue investigating. Should also lead to a better understanding of the edge cases. Can't guarantee a solution for Monday. We can consider, as a temporary measure, a log entry preceding these exceptions that explains them. Edit: solved; see mentioned PR. |
Replaces NioClient and NioClientManager with BlockingClient, which gets rid of misguiding log entries as reported in bisq-network#4058 PR's comments by @ripcurlx. Also introduces a few minor changes including better logs and clearer code.
I've now also tested the two configuration errors again on Regtest and it is working as before 👍 |
Ups wrong PR 😉 |
Fixes #4057
@wiz discovered (#4057) than in some cases LocalBitcoinNode will hang during Bisq setup. This patch gives LocalBitcoinNode's handshake attempt a timeout. Specifically, the part where we block while waiting for BitcoinJ to attempt to return a local BTC node's VersionMessage. I had not originally implemented a timeout, because I was under the impression that in such a case the NioClient would timeout (which does have a timeout).
@wiz has tested this patch and verified that with this patch Bisq handles BitcoinJ hanging gracefully (by timing out the relevant future after 5000 ms).
This patch also prints an error in case a timeout happens, because it normally shouldn't.
The underlying cause is unknown, but seems to come from BitcoinJ.